Sentiment-Based Ranking of Blog Posts Using Rhetorical Structure Theory

نویسندگان

  • José M. Chenlo
  • Alexander Hogenboom
  • David E. Losada
چکیده

Polarity estimation in large-scale and multi-topic domains is a difficult issue. Most state-of-the-art solutions essentially rely on frequencies of sentiment-carrying words (e.g., taken from a lexicon) when analyzing the sentiment conveyed by natural language text. These approaches ignore the structural aspects of a document, which contain valuable information. Rhetorical Structure Theory (RST) provides important information about the relative importance of the different text spans in a document. This knowledge could be useful for sentiment analysis and polarity classification. However, RST has only been studied for polarity classification problems in constrained and small scale scenarios. The main objective of this paper is to explore the usefulness of RST in largescale polarity ranking of blog posts. We apply sentence-level methods to select the key sentences that convey the overall on-topic sentiment of a blog post. Then, we apply RST analysis to these core sentences in order to guide the classification of their polarity and thus to generate an overall estimation of the document’s polarity with respect to a specific topic. Our results show that RST provides valuable information about the discourse structure of the texts that can be used to make a more accurate ranking of documents in terms of their estimated sentiment in multi-topic blogs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Rhetorical Structure Theory for polarity estimation: An experimental study

Sentiment Analysis tools often rely on counts of sentiment-carrying words, ignoring structural aspects of content. Natural Language Processing has been fruitfully exploited in Text Mining, but advanced discourse processing is still non pervasive for mining opinions. Some studies, however, extracted opinions based on the discursive role of text segments. The merits of such computationally intens...

متن کامل

Multiple Ranking Strategies for Opinion Retrieval in Blogs - The University of Amsterdam at the 2006 TREC Blog Track

We describe our participation in the Opinion Retrieval task at TREC 2006. Our approach to identifying opinions in blog post consisted of scoring the posts separately on various aspects associated with an expression of opinion about a topic, including shallow sentiment analysis, spam detection, and link-based authority estimation. The separate approaches were combined into a single ranking, yiel...

متن کامل

Sentiment classification of blog posts using topical extracts

Unlike news stories and product reviews which usually have a strong focus on a single topic, blog posts are often unstructured, and opinions expressed in blog posts do not necessarily correspond to a specific topic. This can lead to unsatisfactory performance of sentiment classification. In this paper we report our pilot study on addressing topic drift in blogs. We examine this phenomenon by ma...

متن کامل

Blog Link Classification

Blog links raise three key questions: Why did the author make the link, what exactly is he pointing at, and what does he feel about it? In response to these questions we introduce a link model with three fundamental descriptive dimensions where each dimension is designed to answer one question. We believe the answers to these questions can be utilized to improve search engine results for blogs....

متن کامل

Targeting Sentiment Expressions through Supervised Ranking of Linguistic Configurations

User generated content is extremely valuable for mining market intelligence because it is unsolicited. We study the problem of analyzing users’ sentiment and opinion in their blog, message board, etc. posts with respect to topics expressed as a search query. In the scenario we consider the matches of the search query terms are expanded through coreference and meronymy to produce a set of mentio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013